A Multi-Core Pipelined Architecture for Parallel Computing

Authors

Duoduo Liao

Simon Y. Berkovich

Abstract

Parallel programming on multi-core processors has become the industry's biggest software challenge. This paper proposes a novel parallel architecture that executes sequential programs through multi-core pipelining, based on program slicing and a new dynamic memory/cache management technology. The architecture is well suited to processing large geospatial data in parallel without parallel programming. It addresses the need to relocate data from one memory hierarchy to another in a multi-core environment: the new memory management technology inserts a layer of abstraction between the processor and the memory hierarchy, allowing the data to stay in one place while the processor effectively migrates as tasks change. The architecture makes full use of the pipeline, automatically partitioning the data and scheduling the partitions onto multiple cores through the pipeline. Its most important advantage is that most existing sequential programs can be used directly with almost no change, unlike conventional parallel programming, which must take scheduling, load balancing, and data distribution into account. The architecture can also be applied to other multi-core/many-core architectures or heterogeneous systems. The paper describes the design of the new multi-core architecture in detail and discusses its time complexity and performance. Experimental results and a performance comparison with existing multi-core architectures demonstrate the effectiveness, flexibility, and diversity of the new architecture, in particular for Big Data parallel processing.

Keywords: Multi-Core Architecture; Pipelining; Sequential Programs; Program Slicing; Crossbar Switching; Parallel Computing; Big Data
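The scheduling idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the sequential program is assumed to be already sliced into stages, each core is assumed to hold one slice, and the crossbar switching is modeled as a plain schedule table in which core s is attached to data block t - s at time step t. The names `partition`, `pipeline_schedule`, `run_pipelined`, and the three example slices are hypothetical.

```python
from typing import Callable, List, Optional

# A program slice: updates one data block in place (hypothetical type alias).
Stage = Callable[[List[int]], None]


def partition(data: List[int], num_blocks: int) -> List[List[int]]:
    """Split data into at most num_blocks contiguous blocks."""
    if not data:
        return []
    size = max(1, -(-len(data) // num_blocks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def pipeline_schedule(num_stages: int, num_blocks: int) -> List[List[Optional[int]]]:
    """For every time step, give the block index each core is attached to.

    Core s runs slice s. At step t it works on block t - s, or idles (None)
    while the pipeline is filling or draining.
    """
    steps = num_stages + num_blocks - 1
    return [
        [t - s if 0 <= t - s < num_blocks else None for s in range(num_stages)]
        for t in range(steps)
    ]


def run_pipelined(stages: List[Stage], blocks: List[List[int]]) -> List[List[int]]:
    """Apply every slice to every block in pipeline order.

    Blocks never move; only the core-to-block assignment changes per step.
    Within one step the cores touch distinct blocks, so a real machine could
    run them concurrently. Here they are simulated one after another.
    """
    for row in pipeline_schedule(len(stages), len(blocks)):
        for core, block_id in enumerate(row):
            if block_id is not None:
                stages[core](blocks[block_id])
    return blocks


# Three example slices of a sequential program computing (2*x + 1) ** 2.
def scale(block: List[int]) -> None:
    block[:] = [2 * x for x in block]


def offset(block: List[int]) -> None:
    block[:] = [x + 1 for x in block]


def square(block: List[int]) -> None:
    block[:] = [x * x for x in block]
```

Calling `run_pipelined([scale, offset, square], partition(list(range(10)), 4))` gives the same values as applying the three slices sequentially to the whole list, because each block still passes through the slices in program order. With S slices and B blocks the schedule has S + B - 1 steps, compared with S * B block-stage operations executed one at a time.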

Similar resources

Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems

Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because of the difficulty of scheduling hardware resources with respect to thread concurrency. To resolve this problem, this paper proposes a novel method that parallelizes the GA by designing three concurrent kernels, each of which running some depe...

Multi-objective exploitation of pipeline parallelism using clustering, replication and duplication in embedded multi-core systems

With the popularity of mobile devices, people require more computing power to run emerging applications. However, the increase in power consumption is a major problem because power is quite limited in embedded systems. Our goal is to consider power consumption along with latency and throughput. We propose a heuristic algorithm, called Parallel Pipeline Latency Optimization for high performance ...

HEVC Hardware Decoder Implementation for UHD Video Applications

In this paper, an efficient hardware architecture that exploits parallel processing for HEVC decoders is proposed by introducing (i) a Coding Tree Unit (CTU)-level pipelined architecture for single-core based processing; and (ii) a multi-core based parallel processing architecture for picture partition decoding with low latency while not requiring additional resources for in-loop filtering (ILF...

New debugging concept for symmetric multiprocessing (SMP)

However, a multi-core processor is not necessarily required for the parallelization of tasks. Hardware multithreading, for example, is an approach that also enables parallelization on single-core processors. Here, we deal with a basic problem of cores with a pipeline architecture: cache misses or data dependencies between the instructions mean that the pipelined instruction processing has to be ...

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF(2^m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...

Publication date: 2014
